Frequency and time filtering of filter-bank energies for HMM speech recognition

Authors

  • Climent Nadeu
  • José B. Mariño
  • Javier Hernando
  • Albino Nogueiras
Abstract

In speech recognition, a discriminative quefrency weighting can be achieved by somewhat decorrelating the frequency sequence of log mel-scaled filter-bank energies with a computationally inexpensive filter. In this paper, we show how the spectral parameters that result from this kind of frequency filtering, both alone and combined with filtering of their time trajectories, are competitive with respect to the conventional cepstral representations of speech signals.
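As a rough illustration of the frequency filtering described in the abstract, the sketch below filters each frame's log mel filter-bank energies along the band axis. The abstract does not state which filter is used, so the two-tap difference filter here (z - z^-1, with zero padding at the band edges) is an assumption chosen only because it is cheap to compute. The function name `frequency_filter` and the array layout (frames by bands) are likewise hypothetical.

```python
import numpy as np

def frequency_filter(log_fbe):
    """Filter log filter-bank energies along the frequency (band) axis.

    log_fbe: array-like of shape (n_frames, n_bands) holding log mel-scaled
             filter-bank energies, one row per frame.
    Returns an array of the same shape where band k holds
    log_fbe[:, k + 1] - log_fbe[:, k - 1].

    ASSUMPTION: the filter z - z^-1 and the zero padding at the edges are
    illustrative choices, not taken from the abstract.
    """
    log_fbe = np.asarray(log_fbe, dtype=float)
    # Add one zero-valued band below and above so the output keeps n_bands values.
    padded = np.pad(log_fbe, ((0, 0), (1, 1)))
    # Difference between the next band and the previous band, per frame.
    return padded[:, 2:] - padded[:, :-2]
```

A call such as `frequency_filter(log_fbe)` on a frames-by-bands matrix returns parameters that stay in the frequency domain, one per band. Filtering the time trajectories, which the abstract also mentions, would apply a filter along the frame axis instead.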


Similar resources

Time and frequency filtering of filter-bank energies for robust HMM speech recognition

Every speech recognition system requires a signal representation that parametrically models the temporal evolution of the speech spectral envelope. Current parameterizations involve, either explicitly or implicitly, a set of energies from frequency bands which are often distributed in a mel scale. The computation of those energies is performed in diverse ways, but it always includes smoothing o...


Comparison and combination of RASTA-PLP and FF features in a hybrid HMM/MLP speech recognition system

Recently, the advantages of the spectral parameters obtained by frequency filtering (FF) of the logarithmic filter bank energies (logFBEs) have been reported. These parameters, which are frequency derivatives of the logFBEs, lie in the frequency domain, and have shown good recognition performance with respect to the conventional mel-frequency cepstral coefficients (MFCC) for HMM systems. In thi...


Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR

In current speech recognition systems, speech is represented by a 2-D sequence of parameters that model the temporal evolution of the spectral envelope of speech. Linear transformation or filtering along both time and frequency axes of that 2-D sequence are used to enhance the discriminative ability and robustness of speech parameters in the HMM pattern-matching formalism. In this paper, we com...


Improving the robustness of the usual FBE-based ASR front-end

All speech recognition systems require some form of signal representation that parametrically models the temporal evolution of the spectral envelope. Current parameterizations involve, either explicitly or implicitly, a set of energies from frequency bands which are often distributed in a mel scale. The computation of those filterbank energies (FBE) always includes smoothing of basic spectral m...


On the decorrelation of filter-bank energies in speech recognition

Cepstral coefficients are widely used in speech recognition. In this paper, we claim that they are not the best way of representing the spectral envelope, at least for some usual speech recognition systems. In fact, cepstrum has several disadvantages: poor physical meaning, need of transformation, and low capacity of adaptation to some recognition systems. In this paper, we propose a new repres...




Publication date: 1996